
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Mark Gijzen 

(ii) TITLE OF INVENTION: Seed Coat Specific DNA Regulatory Region And Peroxidase

(iii) NUMBER OF SEQUENCES: 19 

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: 08/939,905
(B) FILING DATE: 30-Sept-1996 

(vii) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: 08/723,414

(B) FILING DATE: 30-Sept-1996 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1244 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO


(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1056

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 1..78

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG GGT
Met Gly
1

ATG CAT
Met His

TAC AGA
Tyr Arg

TCC ATG CC T CTA TTA GTA GTC GCA ™ TTG TCT GCA phe Ala 

Ser Met Arg Leu Leu Val Val Ala Leu 15 

GCA GGT TTT TCA GTC TCT TAT GOT CAG CTT ACT CCT ACG TTC 

Ala Gly Phe Ser Val Ser Tyr Ala bi ^ 

GAA ACA TGT CCA AAT CTG TTC CCT ATT GTG TTT GGA GTA ATC 
Glu Thr Cys Pro Asn Leu Phe Pro He va ^ 
35 40 


48 


96 


144 




TTC GAT GCT TCT TTC ACC GAT CCC CGA ATC GGG GCC AGT CTC ATG AGG 
Phe Asp Ala Ser Phe Thr Asp Pro Arg Ile Gly Ala Ser Leu Met Arg
50 55 60 

TTT CAT GAT TGC TTT GTT CAA GGT TGT GAT GGA TCA GTT TTG 
Leu His Phe His Asp Cys Phe Val Gln Gly Cys Asp Gly Ser Val Leu
65 70 75 

CTG AAC AAC ACT GAT ACA ATA GAA AGC GAG CAA GAT GCA CTT CCA AAT 
Leu Asn Asn Thr Asp Thr Ile Glu Ser Glu Gln Asp Ala Leu Pro Asn
85 90 

ATC AAC TCA ATA AGA GGA TTG GAC GTT GTC AAT GAC ATC AAG ACA GCG
Ile Asn Ser Ile Arg Gly Leu Asp Val Val Asn Asp Ile Lys Thr Ala
100 105 


GTG GAA AAT AGT TGT CCA GAC ACA GTT TCT TGT GCT GAT ATT CTT GCT 
Val Glu Asn Ser Cys Pro Asp Thr Val Ser Cys Ala Asp Ile Leu Ala
115 120 125 

ATT GCA GCT GAA ATA GCT TCT GTT CTG GGA GGA GGT CCA GGA TGG CCA 
Ile Ala Ala Glu Ile Ala Ser Val Leu Gly Gly Gly Pro Gly Trp Pro
130 135 140

GTT CCA TTA GGA AGA AGG GAC AGC TTA ACA GCA AAC CGA ACC CTT GCA 
Val Pro Leu Gly Arg Arg Asp Ser Leu Thr Ala Asn Arg Thr Leu Ala 
145 150 155

AAT CAA AAC CTT CCA GCA CCT TTC TTC AAC CTC ACT CAA CTT AAA GCT 
Asn Gln Asn Leu Pro Ala Pro Phe Phe Asn Leu Thr Gln Leu Lys Ala


TCC TTT GCT GTT CAA GGT CTC AAC ACC CTT GAT TTA GTT ACA CTC TCA
Ser Phe Ala Val Gln Gly Leu Asn Thr Leu Asp Leu Val Thr Leu Ser
180 185 190 

GGT GGT CAT ACG TTT GGA AGA GCT CGG TGC AGT ACA TTC ATA AAC CGA
Gly Gly His Thr Phe Gly Arg Ala Arg Cys Ser Thr Phe Ile Asn Arg
195 200 205 

TTA TAC AAC TTC AGC AAC ACT GGA AAC CCT GAT CCA ACT CTG AAC ACA
Leu Tyr Asn Phe Ser Asn Thr Gly Asn Pro Asp Pro Thr Leu Asn Thr
210 215 220 

ACA TAC TTA GAA GTA TTG CGT GCA AGA TGC CCC CAG AAT GCA ACT GGG 
Thr Tyr Leu Glu Val Leu Arg Ala Arg Cys Pro Gln Asn Ala Thr Gly
225 230 235 240

GAT AAC CTC ACC AAT TTG GAC CTG AGC ACA CCT GAT CAA TTT GAC AAC 
Asp Asn Leu Thr Asn Leu Asp Leu Ser Thr Pro Asp Gln Phe Asp Asn
245 250 255


AGA TAC TAC TCC AAT CTT CTG CAG CTC AAT GGC TTA CTT CAG AGT GAC
Arg Tyr Tyr Ser Asn Leu Leu Gln Leu Asn Gly Leu Leu Gln Ser Asp


260 


CAA GAA CTT TTC TCC ACT CCT GGT GCT GAT ACC ATT CCC ATT GTC AAT
Gln Glu Leu Phe Ser Thr Pro Gly Ala Asp Thr Ile Pro Ile Val Asn


275 


AGC TTC AGC AGT AAC CAG AAT ACT TTC TTT TCC AAC TTT AGA GTT TCA
Ser Phe Ser Ser Asn Gln Asn Thr Phe Phe Ser Asn Phe Arg Val Ser
290 295 300


192 


240 


288 


336 


384 


432 


480 


528 


576 


624 


672 


720 


768 


816 


864


912 


m ™ r r rp r AC ^ GG g GAT GAA GGA GAA 

£ ^ -V ^ ^ f r 01, Asp Glu dy Glu 

ATT CGC TTG GAA TOT AAT TTT GTG AAT GGA GAC TC ? Phe S £S SI 

He Arg Leu Gin Cys Asn Phe Vai Asn oiy y 335 

- GTG GGG TCG ^ GAT OCT £ C» AAG GTT GTT GOT GAA TCT AAA 

Ser Val Ala Ser Lys Asp Ala Lys bin uys ^ 
340 i4b 


TAAACCAATA ATTAATGGGG ATGTGCATGC TAGCTAGCAT GTAAAGGCAA ATTAGGTTGT
AAACCTCTTT GCTAGCTATA TTGAAATAAA CCAAAGGAGT AGTGTGCATG TCAATTCGAT
TTTGCCATGT ACCTCTTGGA ATATTATGTA ATAATTATTT GAATGTGTTT AAGGTAGTTA


960 

1008 

1056 

1116 
1176 
1236 
1244 


ATTAATCA 


(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4700 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: promoter
(B) LOCATION: 1..1532

(ix) FEATURE:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 1533..1610

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1533..1751

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2383..2574

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 3605..3769

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4033..4515

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 1752..2382

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 2575..3604

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 3770..4032

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1533..1751

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2383..2574

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3605..3769

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 4033..4516


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TAGATAAAAA AATGGGATAT
AATTAAAATT CCTCTTTAAT
TATTTAATAC AAATTTTTAT
AGTACGAAAA CATAAAAAAA
CATATATTAG CTAAATTAGT
ATCTCACCTT TTTCATTTAA
TTCTTCGATA AACCATGAAA
CATGTATGGC TAGTATGGGC
ATAGATGTGA CTTTTGTTGA
CTAATTTGGA GAATTTGAAT
TCAAATTTGT ACCATCATTA
TTTCTTACAT TATCATATTA
TTTAAAAAGT CATACATGCA
AAATGCATGA AAATTAAACT
TGATTATTTT TTGCAAATGA
TTATGGTGTC AATGTTCCAA
ATTTTAAACT TATCTTTACG
CTTTTGTTTT TGTGTTAAAA

AATTTTTCTC AGATGTTGTT 


TATCGACATA Hiiinii'u 


TGTACATAGA 
CTGTTATTAG 
TGTTCTAATT 
ATACATTTCT 
TTTAACATGG 
AGCCAAAATT 
GGAACTCATG 
TATGATCATT 
TTTCCCAAAA 


AATAATTTTT 
TTATTTTTCC 
ATGTTTATTG 
AACCTAATGC 
CAAGAGATAT 
AACAGTAACA 


AGTGATACTT 
AAGAAAAAAA 
GGCTATATAA 
ACTTTTTAAG 
TATATCAGCG 
TGCCCTGGTT 
CCAATGGTAC 
AAATACTCCT 
ATTTGATTAC 
CATTTTGTTT 
TAATAGTTTA 
AAGTCATCAT 
AACATTTAAA 
AAGATCTTAG 
AAAGATTATA 
TTTTCTTAAT 


TATACTGTTT 
GTGAATATTA 
CAATTTTAAT 
TATATGGAAA 
ACCCTATTGT 
TTCTATATTT 
ATACCACCCA 
CAAGCAAAGC 
TGATTGTGAA 
CTCCTGACTA 
AATGCACTAA 
TTACTTTTTA 
CAGTTAAATT 
TTAGTCAAAT 
TGTAGCCTAA 
CAAGTACATA 
CATCTAGTTT 
TTTGTAGAGT 


TTTTAATCAG 
TCGACATAAT 
ATTGGAGAAC 
AGGTTAGCTA 
ACTCTTTGTA 
TCTCTCAATT 
CTTTGAAAGC 
AAGTGTTTAT 
ACTGAGAAAA
CCTTCGTCCC 
TTAATGAATG 
TAATAATTAT 
TTTACAGTAA 
CCCAAAACAA 
TTAATTCTGG 
CATAGATCTA 
TAAACATTAA 
GACGTGCTCC 


60 

120 

180 

240 

300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 


^AACATTT TAATTGGTAT TCAAGTTCAT GAACTTAGTA AATAAGTTTT 
AAC CAT ATT A AC GAAGATT 1 i AA l i * 

^ _. c ,,,, ttt ATGTAAA ATATCAACGT TTTCTGAAAT 
GGTCTTCAGT TTTCAATTTT UunCnnJ 

^TGTTGCTTG TGTGCTCCAA GCACATTTAA GAGATTATAG AAATTAATTT TCAAGAAGAT 
AATGATTCCT ACTGTTGCTG GCCCTAGCAT AGTACAATAA ATGGACTCAT AAATCAACAA 
GTCGTCGTCA TAGGCAATTG GGCATCATAT CATAAACAAT ACGTACGTGA TATTATCTAG 
TGT< 'TCTCAG TTTACTTTAT GAGAAATTAT TTTTCTTTAA AAAAAGTTAA TTAATAAAAA 

^ r ,_„„, aqlTr ATCTCTATAA ATAAAAGGAT 
CATTTGCGAT ACCGTGAGTT AC AAG AAATC CG^G«ATTL ATCTC 1 

CTATATGAGA GGTAAAATCA TATTAACTCA AA ATG GGT TCC ATG CGT CTA TTA 

1 5 

GTA GTG GCA TTG TTG TGT CGA TTT GOT ATG GAT GCA GGT TTT TGA CTC 

Val Val Ala Leu Leu Cys Ala Phe Ala Met ^ 

E £ SI SS HI 55 f £ SS S £ Si - - - - 
2S S S !5 - S S5 SS SS S2 2S S S S£ S SS 

40 45 

„„, m /-<»m r-A^p rp^f- TTT 

£ «; i5 Sg E 55 SS SS E E - Si s; - s; *. 

GTT CAA GTACGTACtI TTTTTTTTGC TTCCAAAATG CCCTGGATAT TTAACAAGAT 

Val Gln


TGCTTTGTTC 
TGAAAAATAA 
AAGGTATTTA 
TTGAATGATA 
GAAGAAAAAA 
TAATTAATAC 
GACAAGTATT 
GTGTATAGAT 
TACATTGATT 
TGATCTGAAC 


ACCTAGAAAA
ATCAGAAAGA 
GTGTGAGAAA 
TTTACATGTC 
GATGTCTTTC 
TATATATCTA 
CTAAAGAGGT 
ATTCTTTTAT 
AACTAATAGC 
AAATTAAGTT 


ATGTGTTTTT 
GATCAAGAAA 
AATATTAAAA 
TTATTAACTT 
AGTTTAGTTT 
TTTACCATAT 
ATCGGTAGAT 
AATTGGTGCA 
TATAATCAAT 
GTTATATTTG 


TTCAACGATC 
ATAGCTAGAA 
CTGAAGAGAA 
AAAGTCACCT 
TGATTAATGC 
TAATTATTAC 


Gat bniim^" 


GAAACTTGTA 
ATTTAGGTTA 
CATTGTGACA 


TTACGTACGT 
AGAAAGCAAC 
AGAAATTAAA 
TTTTTCTTTA 
TAATTATATT 
TATATTTCAT 
TTTATAAAAA 
ATGCTAATTG 
GGTATAGGAG 


TTGTTTGGTT 
GTTTTTTTAA 
TAAGCTTTTC 
AGTTGTGCTT 
TTTAATTAAT 
GATGACAACA 
AATCTTTTGC 
CAATTAATCT 
ACAAATCAAG 


s ss in in £ £ £ *| - - - 

80 8b 


G GGT TGT GAT GGA 
Gly Cys Asp Gly 
75 

AGC GAG CAA GAT GCA 
Ser Glu Gln Asp Ala
90 


1140 

1200 

1260 

1320 

1380 

1440 

1500 

1553 

1601 

1649 

1697 

1745 

1801 

1861 
1921 
1981 
2041 
2101 
2161 
2221 
2281 
2341 
2394 

2442 


^-a ttp GAC GTT GTC AAT GAC ATC 

CTT CCA AAT ATC AAC TCA ATA AGA GA TIG GAG ^ ^ ^ 

Leu Pro Asn lie Asn Ser lie Arg ,iy iQ5 


95 100 


AAG AC A GCG GTG GAA AAT ACT TGT CCA GAC ACA GTT TGT TGT OCT GAT 

Lys Thr Ala Val Glu Asn Ser Cys Pro Asp 125 
110 115 

™* AmA rr-T TPT GTT CTG GTAATTAATA 

^ til ?f. Sa SI S£ " All Sal - 

130 iJ5 


140 145 


_ r „ m „™ aat 1 a a AC CTT CCA GCA CCT TTC TTC 
R T S Sj ifn £S - - Sn -u Pro ,1a Pro Phe Phe 

AAC CTC ACT CAA CTT AAA GOT TCC TTT CCT CTT CAA GGT CTC AAC ACC 

Asn Leu Thr Gin Leu Lys Ala Ser Phe Ala vax lg5 


170 175 


2490 


2538


2584 


CTT GAT TTA GTT ACA CTC TCA GGTATACATA ATCAATTTTT TATTTGCTAT 
Leu Asp Leu Val Thr Leu Ser 

TAGCTAGCAA TAAAA^TCT CTGATACAGA CATATTTACA TAAATTAATT TCTCCATAAA 


2644

2704

2764 

2824 

2884 

2944 

3004 

3064 

3124 


ACTCCTAATT AATTCCCAAC CATTAAAAAC TTGCATGATT GGATTCAAAA TTCTATGGTA 
TTGGGGTTCT GATATAAATT TGTAATTAAA TTGCACTAAA AAAAATTATC ATATACTTTT 
AATAAAAAAA ATTTATCTAA TTTAATTTAT TATTAAAACT ATTTTTAAAA TTCAATCCTA 
ACTCTTTTTT AATCGGAGCA TGTAAGCTGG CACCCACCGT ATATCGTTGG AAGATGCTAT 
AAAACCATTT AATTAATGGA TGGAATCAGT CAAAACATTT AATTCAAAAT ACTCTTAATT 
GTGATTAGTA ATCATGTTCG GGCAAGTTAC GTTGTGTATA ATTAATTTGA CTTAATCAGA 
TAAAAAAACA AATGGACGCA AGCCGGTTGG TATAGATATC ACTGGCCTGT AGAATATGTG 
GTTTTTCACG TTTAAATAAA AGCTAGCTAC TATATTATAT TTAGTCTTTT TTTTTCTTAA 
ACCCATTTAA CGTGATTTAT TGACTGTGAA ACATGTTTCC ACACACAGGC TTAGAAACTC 

_ ~_ m .m~ ™^™zvTTr ATCTATGATG 3184 

CTCGCAACTA ACATCTCCAA AATTTGACTA 1T1«."~» ~~ 

TTCAACTCTA TTATATATAT GTATCATCGC AGTATTAAGA ATTATAATAG TCAAATATAG
AAGTATATCG GGTAAATGTA GTTGCATGTG CGACCTGTTT CGTGTAAAAT GCTTATTCTA 
TATAGCTTTT TTTATTGGAA AATAACGATG AACTAAAAAC GAAAGGGTAT CATATAGTTT 
GACTTTTATG TTAGAGAGAG ACATCTTAAT TTGGTCATAT GTTAAATAAT TAATTACAAT 
GCATACACAA ATATTTATGC CATATCTAAA AAATGATAAA ATATCATAGG TATACTCAAC 
TATATGATAT CCCCATAACA GAAATTGTAC TTTTCTTCAG GCAATGAACT TAACATTTCT 
GTTTGCTAAA AACAAACATC CACTTAAAGT GGTTCAACAT ATTTATCTAA TAATTTACAC 

nr, rrA C-T CCA TTA GGA AGA AGG GAC AGC TTA 

S S So gS 1% So - - ^ *™ ^ Ser Leu 


3244 
3304 
3364 
3424 
3484 
3544 
3604 
3652 


3700 


3748 


3799 


3859 


a™ "\TTTATGTAC TTAAAAATTA TGGATTGAAG CTCTTTTCAT 
P ATTTATAAT AAAATTATCA ATTT ATGT. AC 

CAT _ rr , x »™ A — AAATA AACTATCTCT TGTTTCTTAT 
PCAACTTTTA CTAAAGTTAA GGTGC^^ A - 

AAAAAGATTG M,^ .—TACT TATAAATCAT TAATATATGT ATA - 


- SS £ £ S5 £ S £ - S - 

- S S 25 iS i - E SS S£ S 

2 TTA GAA CTA TTG OST GCA ACA TGC CCC CAG 

Tyr Leu <j1u Vai ^eu ^j-y 235 
23 0 

£ S 5£ i S £ SS S S S K 
£ £ S S S 22 S - S S 

GM CTT ™ TCC ACT GOT GGT GCT GAT AGC ATT 

Glu Leu me ^ * — --- ^ 

TTC Z AGT AAC C f AAT ACT TTC TTT TCC AAC 

Phe Ser Ser Asn Gin Asn inr 3Q0 

-I AAA ATG GGT AAT ATT GGA GTG CTG ACT ^ 

He Lys Met Gly Asn lie Gly vai 
1 * 310 

™^ r aJ TCT z^t TTT GTG AAT GGA GAC TCG 
Z III G fn Cys "n Phe Va! A S „ Gly Asp Ser 

GTG GCG TCC AAA GAT GCT AAA CAA AAG CTT GTT 

Val Ala Ser Lys Asp Ala Lys Gin by 

340 ^ 
ACCAATAATT AATGGGGATG TGCATGCTAG CTAGCATGTA
CCTCTTTGCT AGCTATATTG AAATAAACCA AAGGAGTAGT 
GCCATGTACC TCTTGGAATA TTATGTAATA ATTATTTGAA 
AATCA 

TTC ATA AAC CGA TTA
Phe Ile Asn Arg Leu
205

ACT CTG AAC ACA ACA
Thr Leu Asn Thr Thr
225

AAT GCA ACT GGG GAT
Asn Ala Thr Gly Asp
240

CAA TTT GAC AAC AGA
Gln Phe Asp Asn Arg
255

CTT CAG AGT GAC CAA
Leu Gln Ser Asp Gln
270

CCC ATT GTC AAT AGC
Pro Ile Val Asn Ser
285

TTT AGA GTT TCA ATG
Phe Arg Val Ser Met

GAT GAA GGA GAA ATT
Asp Glu Gly Glu Ile
320

TTT GGA TTA GCT AGT
Phe Gly Leu Ala Ser
335

GCT CAA TCT AAA TAA
Ala Gln Ser Lys *
350

AAGGCAAATT AGGTTGTAAA
GTGCATGTCA ATTCGATTTT
TCTCTTTAAG GTACTTAATT

3919
3979
4035

4083

4131

4179

4227

4275

4323

4371

4419

4467
4515

4575
4635
4695
4700

(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TTYCAYGAYT GYTTYGT

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CTTCCAAATA TCAACTCAAT 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TAAAGTTGGA AAAGAAAGTA 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
ATGCATGCAG GTTTTTCAGT 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TTGCTCGCTT TCTATTGTAT 
(2) INFORMATION FOR SEQ ID NO: 8: 


(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TCTTCGATGC TTCTTTCACC 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CATAAACAAT ACGTACGTGA T 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1031 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
TTTCATGATT GCTTTGTTCA AGGTTGTGAT GGATCAGTTT TACTGAACAA CACTGATACA 
ATAGAAAGCG AGCAAGATGC ACTTCCAAAT ATCAACTCAA TAAGAGGATT GGACGTTGTC 
AATGACATCA AGACAGCGGT GGAAAATAGT TGTCCAGACA CAGTTTCTTG TGCTGATATT 18 
l TT G CAGCTGAAAT AGCTTCTGTT GCTGGGAGGA GGTCAGGATG GCCAGTTCCA M 0 
Igaa gggagagg. aacagcaaac ggaagcg™ CAAATCAAAA CCTTCCAGCA 300 
™ W A ACCTCACTCA ACTTAAAGCT T G CT ™c T G TT CAAGG TCT CAACACGC TT 3 6 

I;™;™ «»™ c— * T *»™™ « 

AACCGATTAT ACAACTTCAG CAACACTGGA C.GA.CGAC, TGGACACAAC ATACTTAGAA 
GTATTGCGTG CAAGATGCCC CCAGAA.TGCA ACTGGGGATA ACCTCACCAA TTTGGACCTG 
AGCACACCTG ATCAATTTGA CAACAGATAC .AC.CCAA.G TTCTGCAGCT CAATGGCTTA S0O 

cttcagagtg acgaagaacg TTT c T cgac T gg T gg T gg T g a T agca TT gc « « 

GCTTCAGCGA ACCAGAATAC TTTCTTTTCC AACTTTAGAG TTTCAATGAT AAAAATGGGT 72 

aatattggag T gc T gac T gg ggatgaagga gaaa TT cgc T T gcaa TCT aa TTTT g T gaa T .0 


ggagactcgt ttggattagc tagtgtggcg TCGAAAGATG gtaaacaaaa GC TT G TTGCT 

CAATCTAAAT AAACCAATAA TTAATGGGGA TGTCGATGCT AGCTACGATG TAAAGGCAAA 

ttaggttgaa acgtg TTT gg tagctatatt gaaataaacg aaaggagtag tgtccatgtg 9 so 

AATTCGATTT TGCGATGTAC CTCTTGGAAT ATTATGTAAT AA-ATTTGA ATGTGAAAAA 102 0 

1031 

AAAAAAAAAA A 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GGCAAACAAT GAACTCCCTT
TTGGAGGGTT ACCCTTCTCT
^nr-MiTOT TAGTTCCATT


GTATGCTTGC
CATCAGTTTT
GAAACTCATT
GTCCTAACAC
TGGCAGATGG
AGTTACTTGC
CATTTGCTGC
TTGGAAGAGC
GTCCCGATCC
GTGGACCTGG
ACTATTACTC
CAACATCTGG
TTTTTGAGAG
ACCAAGGAGA
TTATCAATGT
AGTGATTGGA
GAGATAGTTA


TAGTCTTGTC


ACTAAACAAA


AGTTTCTTGT
TCCTGACTGG
TAATCAAAAT
TCAAGGTCTC
TCATTGCTCT
AACTCTTAAC
CACGAACCTT
TAATCTTCAA
TTCAGATACC
CTTTAGGGCT
GATTAGAAAA
TGCCTCAGCA
AGCAACTAAT
TTAGATGCTT

CGTGCTGTAG CAATAGCTTT GTGCTGTATT 
TCAAATGCGC AACTTGATCC ATCCTTTTAC 
GTTCGTGAAG TCATAAGGAG TGTTTCTAAG 
AGGCTTCACT TTCATGACTG TTTTGTTCAA 
ACTGATACCG TTGTGAGTGA ACAAGATGCT 
GATGTTGTGA ATCAAATCAA AACAGCTGTG 
GCTGATATTC TTGCTCTTTC TGCTGAATTA 
AAGGTTCCTT TAGGAAGAAG AGATGGTTTA 
CTTCCAGCTC CTTTCAATAC TACTGATCAA 
GATACTACTG ATCTGGTTGC ACTCTCCGGT 
TTATTTGTTA GCCGATTGTA CAACTTCAGC 
ACAACTTACT TACAACAATT GCGCACAATA 
ACCAATTTCG ATCCAACGAC TCCTGATAAA 
GTGAAAAAAG GTTTGCTTCA AAGTGATCAA 
ATTAGCATTG TCAACAAATT CGCAACCGAT 
GCTATGATCA AAATGGGAAA TATTGGTGTG 
CAATGCAACT TTGTTAATTC AAAATCAGCA 
GATTCATCTG AGGAGGGTAT GGTTAGCTCA 
AAATTAAGAA GCTATAACTA TGCACATTCA 
TGTGAGCAAA AATCTTTTGG ATTTCATTTG 


GTGGTTGTGC 60
AGGAACACTT 120
AAAGATCCTC 180
GGTTGTGATG 240
TTTCCAAACA 300
GAAAAGGCTT 360
TCATCTACAC 420
ACGGCAAACC 480
CTTAAAGCTG 540
GCTCATACAT 600
GGTACGGGAA 660
TGTCCCAATG 720
TTTGACAAGA 780
GAGTTGTTCT 840
CAAAAAGCTT 900
TTAACCGGGA 960
GAACTTGGTC 1020
ATGTAAATGT 1080
TGGTATGTGT 1140
AAGTGTTTCT 1200


(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1200 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 


GCTCTTCAAA 
GCTTGGAGGA 
GTGTCCAACT 
TCGCATGCTT 
TGCCTCAGTT 
TAACAACTCT 
TTGTCCTAAC 
TCTGGCACAA 
CCGAACACTT 
TGCATTTACT 
ATTTGGAAGA 
AAGTCCCGAT 
TGGTGGACCT 
GAACTATTAC 
CTCAACTTCT 
TTTCTTTGAG 
GACAAAAGGA 
AGAACTAGAT 
TGTAATATAA 
ATAAATAAGT


ACAATGAACT 
CTACCCTTTT 
GTTAGTTCCA 
GCTAGTCTCG 
TTGCTGAACA 
CTAAGAGGTT 
ACAGTTTCTT 
GGTCCTAGTT 
GCAAATCAAA 
GCTCAAGGCC 
GCTCATTGCG 
CCAACTCTTA 
GGCACAAACC 
TCCAATCTTC 
GGTGCAGATA 
AGCTTTAAGG 
GAGATTAGAA 
TTAGCCACCA 
ATAAATTAGC 
TATAACTAGG 


CCTTAGCAAC 
CCTCAGATGC 
TTGTTAGCAA 
TCAGGCTTCA 
ATACTGCTAC 
TGGATGTTGT 
GTGCTGATAT 
GGACGGTTCC 
ATCTTCCGGC 
TCAATACTAC 
CACAATTTGT 
ACACAACTTA 
TTACCAATTT 
AAGTGAAAAA 
CCATTAGCAT 
CTGCAATGAT 
AACAATGCAA 
TAGCATCCAT 
GTAAATGCAC 
CACATTTCAT 


TTCTATGTGG 
ACAACTTAGT 
TGTCTTAACA 
CTTTCATGAC 
AATCGTAAGC 
GAATCAGATC 
TCTTGCACTT 
TTTAGGAAGA 
TCCATTCAAT 
TGATCTAGTT 
TAGTCGATTG 
CTTACAACAA 
CGATCCAACG 
GGGTTTGCTC 
TGTCAACAAA 
TAAAATGGGC 
CTTTGTGAAC 


TGTGTTGTGC 
CCCACTTTTT 
AACGTTTCTA 
TGTTTTGTTC 
GAACAACAAG 
AAACTGGCTG 
GCTGCTCAAG 


n prm\ n a ATP a 


TTATTGAAAT 
GTCACTTGAA 


TCCTTGGATC 
GCACTCTCGG 
TACAACTTCA 
CTGCGCACAA 
ACTCCTGATA 
CAAAGTGATC 
TTCAGCACCG 
AATATTGGTG 
TTTGTGAACT 
TTAGAGGATG 
CTTGTGACTA 
ATTTCATGCC 


TTTTAGTTGT 60
ACAGCAAAAC 120
AGACAGATCC 180
TGGGATGTGA 240
CTTTTCCAAA 300
TAGAAGTGCC 360
CATCCTCTGT 420
TAACCGCAAA 480
AACTTAAAGC 540
GTGCTCATAC 600
GCAGTACTGG 660
TATGTCCCAA 720
AATTTGACAA 780
AAGAGTTGTT 840
ATCAAAATGC 900
TGCTAACAGG 960
CAAATTCTGC 1020
GTATTGCTAG 1080
GATGCCACTA 1140
TGTATATGAG 1200


(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CTCCTTAGCA ACTTCTATCT CGTGTGTTCT GCTTTTAGTT CTGCTTGGAG 
TTCCTCAGAT CCACAACTTA GTCCCACTTT TTACAGCAAA ACGTGTCCAA 

cattgttagg aa.gtct.aa gaaacgtttg taagacagat cgtcgcatgc 

CGTCAGCCTT CACTTTCATG ACTGTTTTGT TCTGGGATGT GATGGCTCAG 
CAATACTCCT AGAATCGTAA GCGAACAACA agcttttcca AATAACAACT 
TTTGGATGTT GTGAATCAGA TCAAAACTGC TGTAGAAAGT GCTTGTCCTA 

„„„„ P ATPPTrTGTT CTGGCACAAG 
TTGTGCTGAT ATTCTTGCAC TTGCTCAAGC ATCCT.TG 

,«..«vm AACCGCAAAC CGAACACTTG 
GACGGTTCCT TTAGGAAGAA GCGATgGTTT AACCL 

TCTTCGGGCT CCATTCAATT CCTTGGATCA CCTTAAACTG CATTTGAGTG 

CATTACTCCT gttctagttg ccgtgtcggg tgctcataga tttggaagag 

ACAATTTGTT AGTCGATTGT ACAACTTCAG CAGTACTGGA AGTCGCGATC 

TrrrGACAAT ATGTCCCAAT GGTGGACCTG 
CACAACTTAC TTACAACAAC TGCGCACAAl a 

,.,^,ATAA ATTTGACAAG AACTATTACT 
TACCAATTTC GATCCAAi^a 

agtgaaaaag ggtttgctcc aaagtgatga agagttgttc tcaacttctg 

CATTAGGATT GTGGAGAAAT TCAGCACCGA TCAAAATGCT TTGTTTGAGA 

pr^AACGGG ACAAAAGGAG 
TGCAATGATT AAAATGGGCA ATATTG^G, GC.AM... 

ACAATGCAAC TTTGTGAACT CAAATTCTGC AGAACTAGAT TTAGGCACCA 
AGTAGAATCA TTAGAGGATG GAATTGCTAG TGTAATATAA ATAAATTAGC 
TTATTGAAAT CTTGTGACTA GATCCCACTA ATAAATAAGT TATAAGTAGG 
GTCAC"°TGAA ATCCTATGCC TTGTATATTA GAGCACGTGT TCTTCTTGGT 


GACTACCCTT 60
CTGTTAGTTC 120
TTGCTAGTCT 180
TTTTGCTGAA 240
CTCTAAGGGG 300
ACACAGTTTC 360
GTCCTAGTTG 420
CAAATCAAAA 480
CTCAAGGCCT 540
CTCATTGCGC 600
CAACTCTTAA 660
GCACAAACCT 720
CCAATCTTCA 780
GTGCAGATAC 840
GCTTTAAGGC 900
AGATTAGAAA 960
TAGCATCCAT 1020
GAAAATGCAC 1080
CACATTTCAT 1140
ATTATACTAT 1200


(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1200 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

raA OAGOTTTTTG CTGTATGGTG TTTGTGCTAA TTGGAGGAGT 60 
AATGCTTGGT CTAAGTGCAA CA^CI ^ ctaatcttga ^ 

ACCCTTTTCA AATGCACAAC TAGATCCTTC TCCTTGGTAG 180 

rPTCGTGTGC TCACAAATGT TTCACAATCT GATCCCAGAA TCCTTGGTAG 
TTCAATCGTA CGTGGTG1GL il.au«^ 




^ CTACATTTTC ATGACTGTTT .C.TCAAOC, TGCCATGCCT CGATTTTGCT 

GAACGATACG gc.acaa.ag TCACCCAGCA aagtgcacca ccaa.^ca ac.cca.aa , 

AGGTTTGGAT GTGATAAACC AGATCAAAAC AGCGGTGGAA AA.GC.G.C C.AACACAG. 3 6 0 
.Lee. GA.A..C..G C.C...C.GC TGAAATATCA .C.GA.C.GG CAAA.GG.CC .0 
TACTTGGCAA GTTCCATTAG GAAGAAGGGA TAGTTTGACA GCAAATAATT CCCTTGCAGC ,8 
TCAAAATCTT CG.GCCGCCA CTTTCAACCT TACTCGACTA AAATCTAACT TTGATAATCA 
^ACC.CAGT ACTACTGATC ^GCAC, CCCAGGTCCC CATACAATTG GAAGACCTCA 
ATGCAGATTT TTCGTTGATC GATTATACAA TTTCAGCAAC ACTGGAAACC CCGATTCAAC 
TCTTAACACG ACCTATTTAC AAACATTGCA AGCAATATGT CCCAATGGTG GACCTGGTAC 
^ACCTAACC GATTTGGACC CAACCACACC AGATACATTT GACTCCAACT ACTACTCCAA 780 

fTTTTTTCCA GAAATGGTTC 840 
TCTCCAAGTT GGAAAGGGCT TGTTTCAGAG TGACCAAGAG CTTTTTTCC 

TGACACTATT TCTATTGTCA ATAGTTTCGC CAATAATCAA AG.CG^CX TTGAAAATTT ,0 
TGTAGCCTCA ATGATAAAAA TGGGTAATAT TGGAGTTTTA ACTCCA.CTC AAGGTGAAAT 
TAGAACACAG TGTAATGCTG TGAATGGGAA TTCTTCTGGA TTGGCTACTG TAGTCACCAA 
„. ,„„.™»» TGGCTAGCTC ATTCTAAATA TAAGCTTGGA AAATATTGAA 1030 

AUAA'iLft 

GAGGTTCTAT AATTTTGTGC ATACATATAT GGTATGTGCA " 
TTATGTTCTT CAAGTTGATC AGGGAC TGTA GAAGCTCCCT AATAATATTT GTGTCAAAGT 1200 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 283 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: PRT

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
FHDCFVQGCD GSVLLNNTDT IESEQDALPN INSIRGLDVV NDIKTAVENS CPDTVSCADI 60
^iaaeLsv agrrsgwpvp lgrrosltan RTLANQNLPA PFFNLTQLF, SFAVQGLNTL 1,0 
DLVTLSGGHT SGRARCSTFI NRLYNFSNTG LIHLDTTYLE VLFARCPQNA TGDNLTNLDL 
STPDQFDNRY YSNLLQLNGL LQSDQERFST PGADTIPLSI ASANQNTFFS NFRVSMIKMG ,40 
NIGVLTGDEG EIRLQCNFVN GDSFGLASVA SKDAKQKLVA QSK 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 355

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: PRT

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
MNSLRAVAIA LCCIVVVLGG LPFSSNAQLD PSFYPNTCPN VSSIVREVIR SVSKKDPRML 60
ASLVRLHFHD CFVQGCDASV LLNKTDTWS EQDAFPNRNS LRGLDVW KTAVEKACPN 120 
TVSCADILAL SAELSSTLAD GPDWKVPLGR RDGLTANQLL ANQNLPAPFN TTDQLKAAFA 180 
AQGLDTTDLV ALSGAHTFGR AHCSLFVSRL YNFSGTGSPD PTLNTTYLQQ LRTICPNGGP 240

gtnltnfdpt tpdkpdknyy snlovkkgll QSDQELFSTS GSDTISIVNK patdqkaffe 300 

SFRAAMIKMG NIGVLTGNQG EIRKQCNF™ SKSAELGLIN VASADSSEEG NVSSM 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 358

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: PRT

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

MNSLATSMWC WLLWLGGL PFSSDAQLSP TFYSKTCPTV SSIVSNVLTN VSKTDPRKEA 60 
SLVRLHFHDC FVLGCDASVL LNNTATIVSE QQAFPNNNSL RGLDWNQIK LAVEVPCPNT 

VSCADILALA AQASSVLAQG PSWTVPLGRR DGLTANRTLA NQNLPAPFNS LDQLFAAFTA 180 

QGLNTTDLVA LSGAHTFGRA HCAQFVSRLY NESSTGSPDP TLNTTYLQQL RTICPNGGPG 240 

TNLTNFDPTT PDKFDKNYYS NLQVKKGLLQ SDQELFSTSG ADTISIVNKF STDQNAFFES 300

FKAAMIKMGN IGVLTGTKGE IRKQCNFVNF VNSNSAELDL ATIASIVESL EDGIASVI 358

(2) INFORMATION FOR SEQ ID NO: 18:


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 347 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: PRT 


355 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:


M ,CWLLW, GGLPFSSDAQ ESPTFYSKTG PTVSSIVSMV LTNVSKTDPR H^SLVPLHP 
HD'TVLGCDA SVLLNNTATI VSEQQAFP, M NS.RGEDWU QI KTAVESAC PNTVSCADIL !20 
ALAQASSVLA QGPSWTVPLG RRDGLTANRT LAUQNEPAPF NSLDHLKLHL TAQGLITPVL 18 0 
VALSGAHTFG RAHCAQFVSR LYNFSSTGSP DPTLNTTYLQ QLRTIG PMGG PGTNLTNFDP 240 
TTPDKFDKNY YSNLQVKF.GL LQSDQELFST SGADTISIVD KFSTDQMAFF ESFF.AAMIKM 300 

GNIGVLTGTK GEIRKQCNFV NSNSAELDLA TIASIVESLE DGIASVI 347

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 351 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: PRT 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

MLCT.S.TAFC CMVFVLIGGV PFSNAQLDPS FYNSTCSNLD SIVRGVLTNV SQSDPRMLGS 60 

LIRLHFHDCF VQGCDASILL NDTATIVSEQ SAPPN^SIR GLDVINQIKT AVENAC PNTV 120 

T<"c:T\iFr)NO 180 

SCADILALSA EISSDLANGP TWQVPLGRRD SLTANNSbAri 

ULSTTDLVAL SGGHTIGRGQ CRFFVDRLYN FSNTGNPDST LNTTYLQTLQ AICPNGGPGT 240 
NLTDLDPTTP DTFDSNYYSN LQVGKGLFQS DQELFSRWGS DTXS^SFA NNQTLFFENF 300 

VASMIKMGNI GVLTGSQGEI RTQCNAVNGN SSGLATVVTK ESSEDGMASS F 351


